DOCUMENT RESUME

ED 200 906                                                     CS 005 996

AUTHOR        Hayford, Paul D.; Salter, Ruth
TITLE         Rule-Based Measures of Literal Comprehension.
PUB DATE      Mar 78
NOTE          52p.; Paper presented at the Annual Meeting of the
              American Educational Research Association (Toronto,
              Canada, March 27-31, 1978).

EDRS PRICE    MF01/PC03 Plus Postage.
DESCRIPTORS   Measurement Techniques; *Reading Comprehension;
              *Reading Tests; *Test Construction; *Test
              Reliability; *Test Theory; *Test Validity

ABSTRACT 

Reading comprehension involves a number of distinctly
different intellectual skills that can be assessed if the proper
techniques are employed. As part of a reading assessment system, two
measures of literal comprehension were developed: the Literal
Comprehension Details Test (LCDT) and the Paraphrase Reading Test
(PRT). Both the LCDT and the PRT assume the possession of visual and
phonetic skills prerequisite to comprehension but do not presume to
assess those skills necessary for processing beyond the apprehension
of the explicit meaning of the text. The LCDT was conceived as a
battery of test passages, scaled by difficulty level, with
accompanying rule-based items for measuring literal comprehension,
whereas the PRT was developed for use as a criterion measure of
literal comprehension. The passages for the PRT are the same passages
that are used in the Multiple-Choice Cloze Exercises. The basic
difference between the PRT and LCDT items, then, is that the PRT
items involve paraphrase. A study of validity indicates that the LCDT
has high face validity as a measure of literal comprehension and that
the face validity for the PRT is higher than that for the LCDT.
(Appendixes include rules for constructing wh- detail items and rules
for constructing items for paraphrase reading tests.) (HOD)

The assessment of reading comprehension is a significant issue for
educators and educational researchers wishing to respond to the widespread
concern over literacy and illiteracy. This paper will make no pretensions
to solving the problem, but will try to contribute practical information
to the continuing discussion. The paper will describe the rationale,
construction, and analysis of two tests developed to measure literal
comprehension, the Literal Comprehension Details Test (LCDT) and the
Paraphrase Reading Test (PRT). Both tests were developed by the Bureau of
School and Cultural Research of the New York State Education Department.

A Measurement Problem
One of the fundamental problems in the measurement of any intellectual 
(non-observable) process, skill, or ability is the degree of relationship 
between the intellectual phenomenon and the instrument or method used to 
measure it. In the measurement of mathematical skills, for example, the 
connection is often quite apparent. The relationship between the ability
to add and the correct answering of addition examples is one of identity;
there can be no better evidence of the possession of addition ability than
the correct answering of addition examples. Similarly, the relationship
between the ability to recall dates of historical events and the provision
of such dates for a given list of events is also identity. But in the
measurement of the processes, skills, or abilities involved in reading
comprehension, the relationships between the intellectual phenomena and the
techniques used to assess them are seldom identity.

For a given reading selection or portion of printed discourse, reading
comprehension may involve a wide variety of skills. Comprehension of a
particular prose passage, for example, may involve skills as disparate as
those related to grammar and vocabulary, inference, propositional logic,
critical reasoning, metaphoric interpretation, and rhetorical analysis.
Assessment of a reader's comprehension of such a passage could be attempted
in a number of ways, but no single method could tap all of the skills
involved in comprehending the passage.

A few brief illustrations may suggest the breadth of skills and the
consequent types of assessment potentially involved even in a relatively
short and uncomplex passage:

Big Jim had to duck his head to get through
the entrance. Everybody else had to do
the same.

If we wanted to assess a reader's comprehension of this passage, our
assessment procedure would be governed by our assessment purposes. If the
reader whose comprehension we wanted to assess was a pupil in the primary
grades, we might use items like the following:

Who had to duck his head to get through
the entrance? (verbatim wh- detail item
stem)

Big Jim had to duck his head to get
through the __________. (verbatim
completion item)

What did Big Jim need to do to pass
through the opening? (paraphrase
wh- detail item stem)

But if we were interested in assessing a more experienced reader's
inferential ability, we might use the following method:

Does the above passage contain sufficient 
information to enable us to tell why Big 
Jim and everybody else had to stoop to 
get through the entrance? Answer in 
complete sentences. 

As these examples have tried to suggest, the assessment of different
types of skills or abilities requires the use of different types of
assessment devices. Measurement of some kinds of comprehension, at the
explicit level, for example, may be accomplished by objective items like
the three given above. But it would be difficult, if not impossible, to
assess certain higher-level comprehension skills by means of objective
items. For instance, an objective item could be constructed to assess
inferential skills similar to those measured by the sample item above which
required a response in complete sentences:

We cannot tell why Big Jim and everybody
else had to duck their heads to get
through the entrance because . . .

A possible correct response might suggest the lack of sufficient context
(Big Jim and the others might be giants, for example, or they might be
normal-sized people in a land like Lilliput). But such an item would not
be measuring the same skills as the item which required a complete-sentence
answer. To select a valid inference from a list of inferences of varying
degrees of plausibility is not equivalent to drawing a valid inference.

The insistence on this distinction may seem like nitpicking, but the
assumption that such different items measure the same things yields unhappy
consequences. To wit, either there are no distinctions to be made among the
intellectual (as opposed to the physical) skills related to reading
comprehension, or there is no viable method of distinguishing among such
varied skills as are related to reading comprehension. If either of those
assumptions is made (and one or the other must be if we accept the initial
premise in this paragraph), then it follows that any reading test which
presupposes decoding will be as good or as useful as any other reading test.
Essentially, this conclusion would render the use of reading tests futile,
for results of such tests would be largely uninterpretable.

Pursuing this argument further, if we are unwilling to acquiesce in the
futility implicit in the previous line of reasoning, then we must contend
that reading comprehension involves a number of distinguishably different
intellectual skills and that these different skills can be assessed if the
proper techniques are employed. The aim of this paper is to describe and
evaluate an attempt at identifying skills or ranges of skills involved in
reading comprehension and the accompanying techniques developed to assess
the skills so identified.

As mentioned previously, the two tests under discussion, the Literal
Comprehension Details Test (LCDT) and the Paraphrase Reading Test (PRT),
were both developed as measures of literal comprehension. The LCDT,
developed during 1974 and 1975, was produced as part of a reading assessment
system conceived as comprehensive and multi-faceted. The system would have
included a wide variety of test items for the measurement of a broad spectrum
of reading skills. Though the system was never completed in the form in
which it was conceptualized, the LCDT was produced as one of the literal
comprehension components. The PRT was constructed during 1976 and 1977
expressly as a criterion measure of literal comprehension for use in the
construct validation of the Multiple-Choice Cloze (MCC) Exercises, another
measure developed by the Bureau of School and Cultural Research to assess
literal comprehension.

Defining Literal Comprehension
Detailed analysis of literal comprehension and extended discussion
attempting to define the term and establish a construct of literal
comprehension are recorded elsewhere ("Construct Validation of
Multiple-Choice Cloze Exercises," 1977; Kidder, 1976; Schuder, Kidder, &
O'Reilly, 1976; O'Reilly, Schuder, & Kidder, 1976). Rather than attempting
or repeating prior attempts at a precise technical definition of literal
comprehension, the present discussion will try to describe in a nontechnical
way the limits of the range of skills involved in literal comprehension.

The term literal comprehension can entail two separate but related
concepts. Literal comprehension can refer to the skill or process by the
application of which, or during which, a reader apprehends the literal
meaning of discourse, the meaning of discourse at its literal level. Or
literal comprehension can signify the result or product of the process of
apprehending the literal meaning of discourse.

It is clear that the acquisition of the product literal comprehension
implies that the process of apprehending meaning at the literal level of
discourse has occurred. However, as suggested previously, the process of
literal comprehension is unmeasurable because it is an unobservable,
intellectual process. But since there cannot be a literal comprehension
"product" independent of a literal comprehension "process," the measurement
of the product literal comprehension is a direct indicator of the efficiency
of the process literal comprehension.

All discourse which has clear and unambiguous meaning possesses a
literal level. Without a literal level of meaning, discourse could have no
determinate complex, inferential, or higher level of meaning. The literal
level of meaning, then, is the foundation upon which all other meanings rest.

The reply may be made that many words, and even many sentences, can
have multiple meanings, and that these multiple meanings refute the
contention that a literal level underlies all other meaning. But such a
position neglects the context which discourse provides and which limits and
excludes many possible meanings in favor of a single meaning. (No words or
sentences with communicative purpose occur independent of context.) The
contexts of discourse disambiguate potentially ambiguous words and
sentences. Consider the following sentence:

We had a ball.

As it stands, independent of any surrounding context, the sentence is
ambiguous. Its potential meanings include, for example, we held a formal
dance, we had in our possession a round object used in games, or we
experienced a time of enjoyment. But given context, the ambiguity is
dissipated:

We all wanted to play baseball. We had
a ball. But we had no bat. We were
frustrated.

In this example, the context clearly excludes the first and third of
the potential meanings noted above (as well as any others) and specifies
the second meaning. Indeed, from the context we know not only that ball
refers to a round object used in games, but also that the ball is the kind
used in baseball games: a baseball.

Context, then, which discourse (as opposed to individual words or
sentences) provides, does exclude and disambiguate, so that where there is
clearly specifiable meaning there is a literal level of meaning.

To anticipate one further objection, it may be argued that some writers,
and especially modern writers, have attempted, in poetry and in prose, to
suggest aspects of their experience which provoke anxiety or frustration or
seem incoherent. It may be asserted that, in conveying such impressions,
writers produce passages which have no literal level of meaning. But the
obvious reply is that there is a distinction to be made between the appearance
or suggestion of incoherence (consciously rendered and controlled) and
incoherence itself. When a writer's efforts result in the latter effect--
incoherence--we no longer accord him the title of writer; we merely observe
that he has (not permanently, we hope) lapsed into incoherence. He has,
in short, failed clearly to communicate; if he has aimed at producing
multiple levels of meaning or interpretation, he has, through the absence
of a foundation for those meanings or interpretations, missed his mark. He
has not provided a literal level of meaning.

Practically speaking, for much of the printed discourse one encounters,
the literal level is the only level of meaning. That this is so will be
evident at a moment's reflection. The author of a textbook, for instance,
has as his main purpose the conveying of information as clearly and directly
as possible. To achieve clarity and directness, the textbook author does
not typically employ deception or indirection, purposeful ambiguity, a
complex persona, irony, or other such literary and rhetorical techniques.
His aim typically is not to call attention to his authorial virtuosity, but
rather to be as straightforward and uncomplicated as he can. For this
reason he will try to provide the kind of context which will most efficiently
exclude unintended meanings and specify intended ones.

To reiterate, for much of what we read, the literal level is the only
level of meaning, and apprehension of the literal level of meaning does not
call into play complex, inferential, or other higher-level thought processes.
To be sure, such processes do come into play as we reflect on what we read,
but reflection is not a reading skill but a thinking skill.

The LCDT and the PRT were designed to measure literal comprehension.
Both words in the term literal comprehension as used to describe what these
two tests measure are significant. On the one hand, the word comprehension
implies that the tests are not focusing on eye control, phonetic decoding,
or other prerequisites of apprehension or understanding. On the other hand,
the word literal indicates that the tests are measures of the explicit and
clearly implied meanings of discourse, rather than of such additional
meanings as require, for instance, complex or higher-level inferential,
analytic, synthetic, rhetorical, allusive, or critical reading or thinking
skills.

The LCDT and the PRT, then, assume the possession of visual and
phonetic skills prerequisite to comprehension but do not presume to assess
those skills necessary for processing beyond the apprehension of the
explicit meaning of the text.

Some examples may help to fix more clearly the limitations of the LCDT
and the PRT as measures of literal comprehension. It is assumed, for
instance, that the two tests measure (or indicate the possession of) the
kind of skills or abilities required to apprehend the meaning of the
following:

As Mary skipped along the sidewalk, her
shoelace came untied. She tripped and
fell and bruised her knee.

Literal comprehension of these sentences would entail (1) understanding the
grammatical or syntactic relations among the words, including noun/verb
distinctions, verb inflections, and pronominalization; (2) apprehending the
explicit meanings of the words, including what such words as Mary, skipped,
sidewalk, shoelace, tripped, fell, bruised, and knee mean; and (3)
understanding the clearly implied meanings that Mary bruised her knee because
she fell and fell because she tripped and tripped because her shoelace came
untied.

The skills involved in processing these two sentences are the skills
of literal comprehension. It is clear that such skills do not include
higher-level intellectual processing. While it is assumed that all reading
comprehension involves inference at a low level (e.g., inferring that
orthography contains potential meaning, that letters in sequence form words,
that words symbolize sounds, and that particular sounds have particular
meanings in given contexts), it is also assumed that such inference as is
required, say, for complex propositional logic is not involved in literal
comprehension. To take another example, in Keats's phrase,

"No, no, go not to Lethe . . . ,"

the literal level involves grammatical and vocabulary knowledge, most of which
would be possessed by fairly young school children (with perhaps the exception
of the relatively unfamiliar Lethe). What the literal level does not involve
is the kind of additional inferential processing which discovers (from
context) that there is a speaker addressing someone who has sought a certain
kind of advice. Also not involved in literal comprehension would be the
perception of the urgency or emphasis of the first two words. Further, any
allusions or suggestions called up by Lethe (the classical nether-world
river of forgetfulness)--say, to Homer or to Greek mythology--would not be
part of literal comprehension.

Similarly, given the following Swiftian product, "I am very sensible
what a weakness and presumption it is to reason against the general humour
and disposition of the world," literal comprehension would involve knowledge
of the meanings of the words in this context. Literal comprehension would
not involve the perception that this is a curious thing to say (i.e., apart
from the possible unfamiliarity of some of the words), or the perception
that the tone of the statement needs to be pursued and pinned down.

Literal comprehension, then, given the necessary visual abilities and
orthographic-phonetic knowledge, requires possession of grammatical rules
and semantic knowledge. To possess grammatical rules signifies the capacity
to apply the principles which govern the positional and inflectional
relationships among words. It does not necessarily entail the ability to
state formulated rules explicitly. Thus, children can apply grammatical rules
without being able to express or define such rules.

The semantic capacity required for processing discourse involves two
kinds of knowledge: (1) vocabulary (or "dictionary") knowledge, the ability
to recognize a given word, and (2) a script or schema which permits the
determination of the meaning of a given word in a given context. For
instance, the word dogs, isolated from any surrounding or related context,
has no determinate meaning. Only when dogs is used in a particular context
does it take on a definite meaning. In the context of greyhound racing, dogs
might have one meaning; used metaphorically by tired mail carriers, it might
have another significance; the possibilities for varied semantic--and
grammatical--significance clearly abound (e.g., "Hate dogs their flight . . .").
The semantic capacity properly to understand any word in a given context
depends on a script which includes experience of that word in one of its
particular contexts. If a person possesses grammatical rules, then literal
comprehension requires only (1) that he have previously encountered and
understood a given word and (2) that he have encountered and understood it
in a context which delineated its meaning as the present context delineates
it.

The LCDT and the PRT both appear to be measures of literal comprehension.
That is, both tests seem to assess the kind of grammatical and semantic
knowledge and ability necessary for apprehending the explicit and clearly
implied meanings of discourse without requiring higher-level skills. Since
one of the principal purposes of this paper is to evaluate the practical
potential of these two tests as measures of literal comprehension, the
following sections will describe the construction of the tests. Then the
paper will focus on the research studies which have involved the tests.
The paper will conclude with evaluations of the performances of the LCDT
and the PRT, including practical recommendations concerning replicability,
limitations, and modifications.

The Literal Comprehension Details Test (LCDT)
The LCDT was conceived as a battery of test passages, scaled by difficulty
level, with accompanying rule-based items for measuring literal comprehension.
The difficulty levels were interpreted from readability scores based on the
Spache and Dale-Chall readability formulas. The passages were to be coherent
and unified, and their lengths were to vary by difficulty level. The
relationship between readability scores and difficulty levels, as well as
approximate passage lengths, is illustrated in Table 1. Each passage was to
be accompanied by 2 main idea items, 2 title items, and 8 wh- detail items.
Though some passages did not permit the achievement of this goal (e.g., it
was not always possible to write 2 title items according to the formulated
rules), on most passages the requisite number of items could be written. The
completed corpus consisted of 300 passages, 15 at each of 20 different
difficulty levels (spanning, approximately, grades 1 to 10), with usually
2 title items, 2 main idea items, and 8 wh- detail items. It should be noted
at this point that since the rule-based title and main idea items were not
used in the study to be reported, there will be no further discussion of them.
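
The correspondence between readability scores and difficulty levels
summarized in Table 1 can be sketched in a few lines of Python. This is only
an illustration, not part of the original test materials: the function name
and the formula labels are hypothetical, and the sketch assumes that the
regular spacing of the score ranges (steps of 0.5 for Spache scores from 1.0
to 3.9, steps of 0.25 for Dale-Chall scores from 4.50 to 7.99) holds for
every row of the table.

```python
def difficulty_level(score, formula):
    """Map a readability score to an LCDT difficulty level (1-20).

    Assumes the evenly spaced ranges of Table 1. Returns None for scores
    that fall outside the ranges listed for the given formula.
    """
    # Work in integer hundredths to avoid floating-point edge effects.
    hundredths = round(score * 100)
    if formula == "spache" and 100 <= hundredths < 400:
        return 1 + (hundredths - 100) // 50      # levels 1-6
    if formula == "dale-chall" and 450 <= hundredths < 800:
        return 7 + (hundredths - 450) // 25      # levels 7-20
    return None


print(difficulty_level(2.7, "spache"))       # falls in the 2.5-2.9 range
print(difficulty_level(6.30, "dale-chall"))  # falls in the 6.25-6.49 range
```

Scores between the two scales (for example, 4.2) are not assigned a level in
this sketch, since Table 1 lists no range for them.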

                                  Table 1

         Readability Scores, Difficulty Levels, and Passage Lengths
                     Literal Comprehension Details Test

Formula      Readability Scores   Difficulty Level   Passage Length (Words)

Spache       1.0-1.4                     1           30
Spache       1.5-1.9                     2           40
Spache       2.0-2.4                     3           50
Spache       2.5-2.9                     4           60
Spache       3.0-3.4                     5           70
Spache       3.5-3.9                     6           80
Dale-Chall   4.50-4.74                   7           90
Dale-Chall   4.75-4.99                   8           100
Dale-Chall   5.00-5.24                   9           110
Dale-Chall   5.25-5.49                  10           120
Dale-Chall   5.50-5.74                  11           130
Dale-Chall   5.75-5.99                  12           [illegible]
Dale-Chall   6.00-6.24                  13           [illegible]
Dale-Chall   6.25-6.49                  14           160
Dale-Chall   6.50-6.74                  15           170
Dale-Chall   6.75-6.99                  16           170
Dale-Chall   7.00-7.24                  17           170
Dale-Chall   7.25-7.49                  18           180 (expository), 220 (narrative)
Dale-Chall   7.50-7.74                  19           180 (expository), 220 (narrative)
Dale-Chall   7.75-7.99                  20           180 (expository), 220 (narrative)

Passage sources. Passages for the LCDT were from three different sources.
Some were produced under contract for the Bureau of School and Cultural
Research. Some were taken by Bureau staff from a variety of sources, and
some were written by Bureau staff expressly for the LCDT. Upon reception
and inspection, the passages produced under contract were found to require
significant editing and rewriting, both to attain desired prose standards
of coherence and unity and to assure accurate readability scores. No passage
received under contract escaped revision. The sources of these
contractually-produced passages were encyclopedias, standardized test
passages, and some textbooks. Bureau staff took many passages from such
sources as literature anthologies, newspapers, magazines, and encyclopedias.
Most of these passages required little or no editing, while some required
moderate editing. The original passages written for the LCDT vary, as might
be expected, from accounts of personal experience to
academic-informative-general interest passages to rather fanciful pieces.

The exercise of accumulating a set number of passages of reasonably
acceptable prose quality at given difficulty levels (i.e., within narrowly
specified ranges of readability scores) requires considerable discipline.
Most passages can be taken from the various sources or created without great
vexation. Randomly-selected passages will span a great range of difficulty.
But when, say, 75 percent of the passages have been collected at the required
difficulty levels, or when the desired numbers of passages have been gathered
for given difficulty levels, the exercise of producing (or locating) passages
for specific difficulty levels can become rather grueling. It is largely for
this reason that passages taken from printed sources required editing. If,
for example, an extant passage had a difficulty level of 16 but the passages
for level 16 had already been produced and more level 15 (or level 17)
passages were required, then experimentation with easier (or harder) synonyms
or with shorter (or longer) sentences had to occur. To alter passages without
significantly distorting meaning and without producing barbarous prose placed
great demands on sensitivity and concentration.

Item-writing rules. After the 300 readability-scaled passages were
completed, some experimentation occurred on items which would measure literal
comprehension and which could be written according to rules. Both the
rationale for rule-based items and the decision to write wh- detail items
were based on the work of Bormuth (1970) on achievement test items.
Essentially, Bormuth argues for rule-based items on the grounds that only such
items avoid idiosyncrasies and interpretations introduced by item writers;
in a word, rule-based items give some measure of assurance of the avoidance
of subjectivity or eccentricity.

Even casual inspection of reading comprehension items on standardized
achievement tests reveals an enormous variety in the type and quality of
items, even of items categorized by test-makers as having similar
measurement properties. The question, of course, is whether obviously-different
items can be measuring the same thing. Put another way, how can one interpret
performance on such items?

Rule-based items represent a method of avoiding this uncomfortable
question, for if the item-writing rules are clear, it is a simple matter to
review the items for conformity. Items written acceptably to rules should
be readily interpretable.

Bormuth's recommendation of wh- detail items [i.e., items with stems
introduced by the following "wh" words: how, what (noun, pronoun), what
(verb), when, where, which, who, why] follows centuries of standard pedagogical
practice. Wh- detail questions are extremely useful for getting at the literal
meaning of discourse. The writing of the wh- detail items for the LCDT departed
in two ways from Bormuth's proposed methodology. First, because of time
constraints, only one of each type of wh- detail item was to be written for
each passage; thus, a maximum of eight wh- detail items could be written for
each passage. Secondly, the rules for writing wh- detail items for the LCDT
permitted only verbatim items to be written. (Bormuth's illustrative
items occasionally involve paraphrase or substitution of synonyms.)

For each LCDT passage, then, eight verbatim wh- detail items were to be
written. Briefly, the procedure involved random selection of a sentence
from a passage. Given a sentence, an attempt was made to write a verbatim
wh- detail item. The eight types of wh- words were listed alphabetically,
and an attempt was made to write the first wh- item type (i.e., "how") on
the first sentence randomly selected. If the sentence did not permit a
"how" item, an attempt was made to write a "what (noun, pronoun)" item.
The item writer would try, for each new randomly-selected sentence, to write
the next type of wh- detail item on the list. (If one was skipped, the item
writer would return to it on the next sentence.) This procedure permitted
the production of nearly eight verbatim wh- detail items for each of the 300
passages. The rules for writing wh- detail items for the LCDT are contained
in Appendix A.
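
The rotation through wh- item types described above can be sketched as
follows. This is a minimal illustration under stated assumptions, not the
Bureau's actual procedure: the function name is hypothetical, the
`try_write` callback stands in for the item writer's judgment about whether
a sentence permits a given item type, and sentences are assumed to be drawn
at random without replacement, with at most one item written per sentence.

```python
import random

# The eight wh- item types, in the alphabetical order given in the text.
WH_TYPES = [
    "how", "what (noun, pronoun)", "what (verb)", "when",
    "where", "which", "who", "why",
]


def write_wh_items(sentences, try_write, rng=None):
    """Return a list of (wh_type, sentence, item) tuples for one passage.

    try_write(sentence, wh_type) is a hypothetical callback that returns
    an item, or None if the sentence does not permit that item type.
    """
    rng = rng or random.Random()
    order = list(sentences)
    rng.shuffle(order)              # random selection of sentences
    pending = list(WH_TYPES)        # types not yet written, in list order
    items = []
    for sentence in order:
        if not pending:
            break                   # one item of each type already written
        # Skipped types sit earliest in `pending`, so they are retried
        # first on each new sentence before later types are attempted.
        for wh_type in pending:
            item = try_write(sentence, wh_type)
            if item is not None:
                items.append((wh_type, sentence, item))
                pending.remove(wh_type)
                break
    return items
```

Because some sentences permit none of the remaining types, a passage may
yield fewer than eight items, which is consistent with the "nearly eight"
items per passage reported above.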

Each wh- detail item features a stem, introduced by the appropriate
wh- word, and either 3 or 4 responses (3 for difficulty levels 1-4, 4 for
difficulty levels 5-20). One response is the correct answer, and the other
responses are both grammatically and semantically plausible. Dependent on
the passage upon which it is based, a wh- item should not be answerable through
application of test-wiseness skills. The distractors are taken verbatim from
the passage whenever possible. This is another precaution introduced to
assure that the passages be read before the items are answered. Traditional
standards for objective items are also observed in the wh- detail items (e.g.,
avoidance of correct responses standing out because of greater length than
distractors). All LCDT passages and items were thoroughly reviewed before
tests were assembled from the passage-item battery.
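
Some of the mechanical conventions in the preceding paragraph lend
themselves to a simple automated check. The sketch below is illustrative
only; the function names and warning messages are hypothetical, and the
checks cover just three of the stated conventions (number of responses by
difficulty level, distractors taken verbatim from the passage, and a correct
response that does not stand out by length). Judgments of grammatical and
semantic plausibility are left to human review.

```python
def expected_response_count(level):
    """3 responses for difficulty levels 1-4, 4 responses for levels 5-20."""
    if not 1 <= level <= 20:
        raise ValueError("difficulty level must be between 1 and 20")
    return 3 if level <= 4 else 4


def check_item(level, passage, responses, correct_index):
    """Return a list of warnings for one multiple-choice wh- detail item."""
    warnings = []
    if len(responses) != expected_response_count(level):
        warnings.append("wrong number of responses for this difficulty level")
    distractors = [r for i, r in enumerate(responses) if i != correct_index]
    for distractor in distractors:
        # The rule applies "whenever possible", so this is only a warning.
        if distractor.lower() not in passage.lower():
            warnings.append(f"distractor not verbatim from passage: {distractor!r}")
    if distractors and len(responses[correct_index]) > max(map(len, distractors)):
        warnings.append("correct response is longer than every distractor")
    return warnings


passage = ("Big Jim had to duck his head to get through the entrance. "
           "Everybody else had to do the same.")
print(check_item(1, passage, ["Big Jim", "Everybody", "the entrance"], 0))
```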






The Paraphrase Reading Test

The PRT was developed for use as a criterion measure of literal
comprehension. The occasion for the development of the PRT was a construct
validation study of the Multiple-Choice Cloze (MCC) Exercises, 1725 modified
cloze passages with accompanying multiple-choice items. The passages for
the PRT are the same passages that are used in the MCC exercises. The PRT
items are wh- detail items based on paraphrases of the sentences in the (MCC)
passages. The basic difference between the PRT and LCDT items, then, is
that the PRT items involve paraphrase.

The need for a construct validation study of the MCC exercises arose
for several reasons. An earlier effort to validate the MCC, an effort which
included the use of the LCDT, suffered from the lack of a standardized test
of sufficient quality and interpretability. Another significant reason was
that after the first validity study the MCC underwent substantial change,
including reclozing of passages, replacement of many distractors, and removal
of titles. Perhaps the most significant reason for the second study was a
perceived theoretical shortcoming of the LCDT, which was used as a criterion
measure in the initial study. It could be argued that items on the LCDT could
be answered by application of such test-wiseness skills as orthographic or
phonetic matching. In other words, it might be possible to answer LCDT items
without reading the passages on which they are based. To the extent that
such test-wiseness skills are employed in responding to the LCDT, the test
is invalidated as a measure of literal comprehension.

Rationale. The use of paraphrase items was based on Anderson's (1972)
defense of paraphrase as a valid measure of (literal) reading comprehension:

The argument that paraphrase questions assess
comprehension is very simple . . . . [I]n order
to answer a question based on a paraphrase, a
person has to have comprehended the original
sentence, since a paraphrase is related to
the original sentence with respect to
meaning but unrelated with respect to the
shape or sound of the words. (p. 150)

Further, the rules for paraphrase item writing were derived from Anderson's
definition of paraphrase: "Two statements are defined as paraphrases of
one another if 1) they have no substantive words (nouns, verbs, modifiers)
in common and 2) they are equivalent in meaning" (p. 150). Paraphrase, then,
was selected for use as the criterion measure in the second construct
validation study of the MCC.
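
The first condition of Anderson's definition (no substantive words in common) lends itself to a mechanical check. The following is a minimal sketch, not the procedure used in this study: the stop list of function words is a hypothetical stand-in for a proper classification of nouns, verbs, and modifiers, no stemming is attempted, and the second condition (equivalence in meaning) is left to human judgment. The example pair combines the wording of the sample PRT passage with the stem and keyed response of a sample item.

```python
import re

# Hypothetical stop list: words treated as NOT substantive (assumption).
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "at",
    "for", "by", "with", "from", "as", "that", "this", "it", "they", "them",
    "he", "she", "we", "you", "i", "is", "are", "was", "were", "be", "been",
    "being", "do", "did", "does", "have", "has", "had", "not", "no",
}


def substantive_words(sentence):
    """Return the set of lower-cased words that are not function words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return {w for w in words if w not in FUNCTION_WORDS}


def shared_substantive_words(original, paraphrase):
    """Words that violate condition 1 (substantive words in common)."""
    return substantive_words(original) & substantive_words(paraphrase)


original = "The two gods took on the appearance of poor wayfarers"
paraphrase = "The two divine beings assumed the likeness of needy travelers"
shared = shared_substantive_words(original, paraphrase)
print(shared)  # any word printed here was carried over unparaphrased
```

A non-empty result marks the pair as only a partial paraphrase under condition 1; here the numeral is carried over unchanged.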

Passage sources. As stated, PRT passages are identical to those used
in MCC exercises. The sources of MCC passages are textbooks and a wide
variety of other printed materials, including newspapers, magazines, reference
books, advertisements, and recipes. The passages are brief, never longer
than 80 words and averaging 60 to 70 words. They are coherent, but they are
too short to assure unity in the sense of a beginning, middle, and ending.
The passages are taken as is from their sources, with no "correcting" of
punctuation or grammar. Minor editing occurs very infrequently, and then
only to assure coherence. "Clozed" passages average ten deletions or
blanks, with a multiple-choice item for each blank.

For the PRT, only MCC passages taken from reading or literature texts
were used. The deleted words were replaced in the blanks, and the passages
were retyped. Then each sentence, clause, or long phrase was paraphrased,
and wh- detail items were written for the paraphrased sentences, clauses, etc.

Paraphrase item writing. The rules for writing paraphrases were derived
from Anderson's (1972) brief definition. However, paraphrases for the PRT
were defined somewhat more restrictively than Anderson had required. For
the PRT, synonyms or synonymous phrases used in paraphrasing were to come,
as far as possible, from among words at the same grade level
as the passage containing the sentence to be paraphrased. To assure the
grade level of the paraphrase vocabulary, graded word lists (Harris &
Jacobson, 1972; Carroll, Davies, & Richman, 1971) were used, whenever
possible. Some givens of paraphrasing included the impossibility of finding
synonyms or synonymous phrases for proper nouns, auxiliary verbs, or the verb
to be. MCC passages, the passages on which paraphrase items were to be
written, are scaled using the same readability formulas as the LCDT, and
experience quickly showed that it was not feasible to try to write paraphrases
for the sentences in passages below difficulty level 7 (i.e., below grade 4,
approximately). The problem with these is that many synonyms for words
typically found in texts at such levels are too difficult; to use them would
be to increase the difficulty and complexity of the task involved in responding
to the paraphrase items. To do this would be in some measure to invalidate
the items as measures of literal comprehension. Such items would place a
heavier burden on verbal intelligence than the literal comprehension of the
passage would require.

After each sentence, or significant part of a sentence, in a passage
had been paraphrased, wh- detail items were written on the paraphrases. The
rules (see Appendix B) for writing paraphrase items were adapted from the
rules for writing wh- detail items developed for the LCDT. The basic
difference between the item-writing procedures was that there was no restriction
in the number or type of wh- items written for the PRT. For every paraphrase,
all possible wh- detail items were written. The reason for this was that,
as mentioned above, the passages were very short and as large a pool of items
as possible was desired for each passage to facilitate test construction.
Though the intention was to control paraphrase vocabulary, to keep it
from exceeding the grade level of the source passage, it was very difficult,
and occasionally not possible, to control paraphrase vocabulary on higher
difficulty passages. Graded word sources were not adequate, which is the
same as saying that synonyms for difficult words are often more difficult
than the words for which they are to be substituted.

The basic rules for writing items based on the paraphrase differed
little from the rules for writing wh- detail items for the LCDT. There was
no requirement that distractors be taken from the passage, for example, but
the greatest difference arose in response to the need to control for items
which involved only partial paraphrases. In some cases, sentences, clauses,
or phrases could not be completely paraphrased. That is, it was not always
possible to find synonyms (paraphrases) for every content word. Usually,
a substantial portion of a sentence could be paraphrased, so that there are
no verbatim (unparaphrased) items, but in some cases either a stem or a
response may be incompletely paraphrased. When a correct response was
incompletely paraphrased, distractors were designed to prevent the successful
exercise of such test-wiseness skills as orthographic matching.

There were 356 MCC exercises based on passages taken from reading or
literature texts. The elimination of 122 (from grade 1-3 sources) left
234 passages. From these, 39 passages were selected randomly. Thus, 17
percent of the available passages were sampled for the construction of
paraphrase items. (Some departure from randomness was necessitated because
certain passages did not yield the requisite number of paraphrases.) An
average of about a dozen paraphrase items was written for each passage, and
the items were intensively reviewed both in-house and by reading professionals.
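
The sampling arithmetic in this paragraph can be checked directly. The sketch below is illustrative only: the counts come from the text, while the passage identifiers and the random seed are hypothetical, and the departure from strict randomness mentioned above is not modeled.

```python
import random

total_exercises = 356      # MCC exercises from reading or literature texts
eliminated = 122           # exercises drawn from grade 1-3 sources
sample_size = 39           # passages selected for paraphrase item writing

available = total_exercises - eliminated
sampling_fraction = sample_size / available

# Hypothetical passage identifiers; the real passages are not listed here.
passage_ids = [f"passage-{i:03d}" for i in range(1, available + 1)]
rng = random.Random(0)     # arbitrary seed, used only for repeatability
selected = rng.sample(passage_ids, sample_size)

print(available, f"{sampling_fraction:.1%}")
```

The fraction comes to one sixth of the available passages, which rounds to the 17 percent reported.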
Construction and Administration of the LCDT

In the spring of 1975 the LCDT was administered to over 5,000 students
in grades 1 through 9 in an upstate urban school district as part of a
validity study of the MCC exercises. There were 36 forms of the MCC and
36 forms of the LCDT. The passages on the forms were never identical and,
except at the lower grade levels, seldom the same length, but the 36 MCC
forms were parallel to the 36 LCDT forms. The forms for each test were
divided into 3 levels, with 12 forms at a level. Students in grades 1-3
took Level I forms; students in grades 4-6 and 7-9 took Level II and Level
III forms, respectively. The passages on each MCC form at a test level
were parallel in difficulty to each other and to the passages on the LCDT
forms for the same test level. Parallelism was controlled by readability
scores (difficulty levels). On each test form, passages were arranged in
ascending order of difficulty. The difficulty level ranges for the test
levels of both the LCDT and the MCC are as follows:

                                 Test Level

                            I        II       III

Difficulty Level Range     1-10     5-16     11-20

For construction of the LCDT forms, pairs of difficulty levels were
combined and their passages pooled in preparation for random selection of
test passages. Thus, the first passage on each Level II test form was drawn
randomly from the 30 available passages resulting from the pooling of the
passages at difficulty levels 5 and 6. A similar procedure was followed
for the selection of subsequent passages. The only variation from this
method was at Level I, where difficulty levels 1 and 2 were discrete sampling
units, and at Level III, where difficulty levels 11 and 12 were discrete
sampling units.
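
The pooling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run: the passage identifiers, the figure of 15 passages per difficulty level (chosen so that a pooled pair yields 30), and the function names are all hypothetical.

```python
import random


def sampling_units(first_level, last_level, discrete=()):
    """Group difficulty levels into sampling units.

    Levels named in `discrete` stand alone; the remaining levels are
    pooled in consecutive pairs.
    """
    units = [(level,) for level in discrete]
    remaining = [lv for lv in range(first_level, last_level + 1)
                 if lv not in discrete]
    units += [tuple(remaining[i:i + 2]) for i in range(0, len(remaining), 2)]
    return sorted(units)


def draw_form(pool_by_level, units, rng):
    """Draw one passage at random from each sampling unit, easiest first."""
    form = []
    for unit in units:
        pooled = [p for level in unit for p in pool_by_level[level]]
        form.append(rng.choice(pooled))
    return form


# Hypothetical pool: 15 passage IDs at each of the 20 difficulty levels.
pool = {lv: [f"L{lv:02d}-P{i:02d}" for i in range(1, 16)]
        for lv in range(1, 21)}

level_I_units = sampling_units(1, 10, discrete=(1, 2))
level_II_units = sampling_units(5, 16)
level_III_units = sampling_units(11, 20, discrete=(11, 12))

rng = random.Random(1975)  # arbitrary seed for repeatability
form = draw_form(pool, level_II_units, rng)
print(level_II_units)
print(form)
```

Under these assumptions each test level yields six sampling units, hence six passages per form, arranged in ascending order of difficulty.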

For each LCDT passage on each form, five wh- detail items were chosen,
for a total of 30 items per form. The items were selected randomly where
feasible, but the overriding criteria for item selection were (1) avoidance
of mutual cueing and (2) even distribution of wh- item types. Mutual cueing
was defined as a stem of one item cueing the answer to another item. Even
distribution of item types was achieved for all three test levels. In other
words, there were not more "when" questions than "why" questions, for example,
across the forms at a test level. A typical LCDT passage, with accompanying
items, is illustrated in Figure 1.

The LCDT forms were administered one week after the administration of
the MCC forms. Means and standard deviations and reliability and validity
coefficients were calculated for all MCC and LCDT test forms. Also, data
from Rasch analyses of the forms permitted some inspection of deviant items.
In addition to the MCC and LCDT data, scores on the California Achievement
Test (CAT) for students in grades 1-8 and scores on the Short Form Test
of Academic Aptitude (SFTAA), an IQ measure, for students in grades 1-6 were
obtained. The CAT and SFTAA scores were entered into the validity correlation
analyses (O'Reilly, Schuder, & Kidder, 1975), and the SFTAA data permitted
some factor analyses (O'Reilly & Streeter, 1977).

As shown in Table 2, means and standard deviations for the LCDT (and
for the MCC) were quite consistent, thus suggesting that parallelism among
forms at a test level had been achieved and that the rules for writing wh-
detail items had been applied with a high degree of consistency. Kuder-
Richardson Formula 20 reliability coefficients for the LCDT and the MCC are
reported in Table 3. As illustrated, the average K-R 20 for both tests is
high, indicating again the consistency of both measures. Validity correlations
for the LCDT and the MCC are given in Table 4. At test levels I and II,
     During World War II, Britain was defended by an heroic air force,
but it was difficult to keep the planes aloft. Fuel and spare parts were
hard to get, but the worst problem was the fog which usually covered the
airfields.

     London is known especially for its dense fog. Since the city is near
the ocean, the moist air seeps over the city and its airports, cools, and
changes to fog. Before pollution control, smoke from homes and factories
stuck to the fog which took on the yellow-green of pea soup. This green
fog made it dangerous for planes to take off or land.

     To keep their war planes flying, the English developed a method for
clearing the fog from the airports. They lighted oil burners along the
runways, and warm air rushing upwards carried the fog with it to 2,000
feet or more. Planes could then fly and carry on the defense of Britain.

21. Where is the city of London?

    A. near the ocean
    B. on seven hills
    C. in Europe
    D. on a wide river

22. What kind of fog made it dangerous for planes to take off
    or land?

    A. white
    B. green
    C. grey
    D. dirty

23. Who developed a method for clearing the fog from the
    airports?

    A. the English
    B. the Irish
    C. French
    D. the Dutch

24. When was Britain defended by an heroic air force?

    A. after the fall of Paris
    B. after the attack on Normandy
    C. during World War I
    D. during World War II

25. Why was it difficult to keep the planes aloft?

    A. because many pilots had been killed
    B. because bombs were falling on the airfields
    C. because of the fog which usually covered the airfields
    D. because London is near the ocean

Figure 1. Sample Literal Comprehension Details Test Passage with
Accompanying Items.
the correlations are moderately high to high; such correlations,
demonstrating the high percentage of shared variance between the two measures,
give strong support to the conclusion that both tests are measuring the
same thing (i.e., literal comprehension). It may be noted here that factor
analyses (reported in O'Reilly and Streeter, 1977) resulted in two
factors, which were interpreted as a literal comprehension factor and an IQ
factor. The MCC and the LCDT loaded heavily on the literal comprehension
factor.



Tables 2, 3, & 4 about here

As part of the analysis of the LCDT, an attempt was made to identify
and study the causes of item deviance. To date the analysis is incomplete,
but preliminary efforts attempted to identify possibly deviant items by
means of z-scores. (The z-scores were calculated for the items on each
passage, using average percent correct of the items on a passage and the
standard deviation of the passage items.) Negative z-scores lower than
approximately -1.2 identified apparently or statistically deviant items.
Perhaps 15 percent of the items on the LCDT forms were thus identified.
Inspection of these items, however, frustrated in many cases attempts to
explain their apparent deviance. Some items were clearly and explainably
deviant. For example, extreme awkwardness of item stems and competitive
(i.e., arguably correct) distractors were among the reasons given to account
for actual deviance. As stated above, this phase of the analysis is not yet
complete. It is expected that the completed analysis of LCDT item deviance
will yield generalizations concerning the proportion of explainable deviant
items and the relationship between z-scores and explainable deviance.
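
The z-score screening described above can be sketched in a few lines. The percent-correct values below are invented for illustration, and the use of the population standard deviation is an assumption, since the text does not say which form of the standard deviation was used.

```python
from statistics import mean, pstdev


def flag_deviant_items(percent_correct, cutoff=-1.2):
    """Return (item index, z-score) pairs for items whose z-score falls
    below the cutoff, relative to the other items on the same passage."""
    passage_mean = mean(percent_correct)
    passage_sd = pstdev(percent_correct)  # population SD (assumption)
    if passage_sd == 0:
        return []                         # all items equally difficult
    flagged = []
    for index, p in enumerate(percent_correct):
        z = (p - passage_mean) / passage_sd
        if z < cutoff:
            flagged.append((index, round(z, 2)))
    return flagged


# Invented percent-correct values for the five items on one passage.
passage_items = [82, 78, 85, 41, 80]
flagged = flag_deviant_items(passage_items)
print(flagged)
```

Because each z-score is computed within a passage, an item is flagged only for being much harder than its neighbors on that passage; whether the deviance is explainable still requires inspection of the item itself.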
Table 2

Means and Standard Deviations for the MCC and the LCDT

                            MCC                            LCDT
Level             Form    N     Mean    S.D.     Form    N     Mean    S.D.

I (Grades           1    128   21.03   10.55      37    127   18.?0    7.15
1, 2, 3)            2    126   20.26   10.49      38    124   19.52    7.32
                    3    130   21.51   10.59      39    126   19.57    7.29
                    4    124   22.44   11.34      40    121   19.02    7.30
                    5    126   23.06   11.24      41    119   18.43    7.72
                    6    126   19.71    9.24      42    122   19.17    7.00
                    7    127   21.47   11.22      43    124   19.05    7.28
                    8    127   18.84   10.40      44    124   19.20    7.53
                    9    129   21.98   11.31      45    131   19.10    7.59
                   10    127   20.47   10.43      46    123   19.19    7.47
                   11    120   23.39   11.25      47    121   18.65    7.18
                   12    123   22.67   11.41      48    121   20.12    7.69

II (Grades         13    147   41.46   11.45      49    147   22.74    5.57
4, 5, 6)           14    151   40.01   14.11      50    153   21.95    5.46
                   15    153   38.73   12.51      51    148   22.72    4.85
                   16    152   40.99   11.62      52    152   22.74    5.66
                   17    146   42.18   12.60      53    145   23.52    5.46
                   18    151   36.35   11.03      54    144   23.19    5.45
                   19    152   41.80   11.48      55    145   22.96    5.00
                   20     --   42.00     --       56     --   22.60     --
                   21     --     --      --       57     --     --      --
                   22     --     --      --       58     --     --      --
                   23     --     --      --       59     --     --      --
                   24     --     --      --       60     --     --      --

III (Grades        25    167   36.60   12.53      61    163   23.81    5.51
7, 8, 9)           26    164   36.44   11.69      62    162   23.89    7.01
                   27    160   38.86   14.33      63    164   24.25    5.79
                   28    161   40.47   12.82      64    161   23.89    4.83
                   29    158   39.17   11.52      65    165   23.53    4.75
                   30    165   42.54   13.35      66    166   21.20    6.20
                   31    158   39.46   12.45      67    154   24.88    4.85
                   32    163   37.07   12.01      68    163   22.40    5.65
                   33    166   37.38   11.98      69    164   24.02    4.99
                   34    159   38.08   13.60      70    156   22.01    5.31
                   35    163   37.82   13.18      71    163   23.16    5.94
                   36    165   41.82   12.56      72    154   22.03    6.90

Note. A dash or question mark marks a value that is illegible in the
source copy.
Table 3

Kuder-Richardson Formula 20 Reliability Coefficients
for the MCC and the LCDT

                              MCC                              LCDT
Level             Form    N     I    KR-20    SE     Form    N     I    KR-20    SE

I (Grades           1    128   --     --     1.73     37    127   30    .92    2.02
1, 2, 3)            2    126   41    .95     1.64     38    124   30    .94    1.79
                    3    130   41    .96     1.46     39    126   30    .90    2.30
                    4    124   41    .96     1.46     40    121   30    .90    2.31
                    5    126   41    .95     1.73     41    119   30    .91    2.32
                    6    126   39    .95     1.57     42    122   30    .91    2.10
                    7    127   41    .96     1.45     43    124   30    .93    1.92
                    8    127   39    .96     1.51     44    124   30    .90    2.38
                    9    129   41    .97     1.33     45    131   30    .90    2.43
                   10    127   41    .96     1.49     46    123   30    .91    2.24
                   11    120   41    .96     1.43     47    121   30    .94    1.76
                   12    123   41    .96     1.54     48    121   30    .92    2.18
              Median                 .96     1.49                       .91    2.21

II (Grades         13    147   60     --     1.98     49    147   30    .93    1.47
4, 5, 6)           14    152   60    .96     2.82     50    153   30    .93    1.44
                   15    153   60    .96     2.50     51    148   30    .90    1.53
                   16    152   60    .96     2.32     52    152   30    .86    2.11
                   17    146   60    .9?     2.18     53    145   30    .93    1.44
                   18    151   60    .94     2.69     54    144   30    .92    1.5?
                   19    152   60    .97     2.33     55    145   30    .85    1.94
                   20    148   60    .95     2.69     56    149   30    .95    1.33
                   21    152   60    .95     2.53     57    147   30    .94    1.50
                   22    152   60    .97     2.35     58    148   30    .91    1.76
                   23    143   60    .97     2.25     59    157   30    .94    1.35
                   24    149   60    .95     2.97     60    147   30    .93    1.51
              Median                 .96     2.35                       .93     --

III (Grades        25    167   60    .96     2.51     61    163   30    .91    1.66
7, 8, 9)           26    164   60    .95     2.61     62    162   30    .94    1.71
                   27    160   60    .96     2.87     63    164   30    .96    1.16
                   28    161   60    .97     2.22     64    161   30    .89    1.60
                   29    158   60    .96     2.30     65    165   30    .89    1.57
                   30    165   60    .97     2.31     66    166   30    .94    1.52
                   31    158   60    .95     2.78     67    154   30    .96    0.97
                   32    163   60    .96     2.40     68    163   30    .90    1.78
                   33    166   60    .95     2.67     69    164   30    .96    0.99
                   34    159   60    .95     3.03     70    156   30    .93    1.40
                   35    163   60    .97     2.28     71    163   30    .95    1.32
                   36    165   60    .97     2.17     72    154   30    .95    1.56
              Median                 .96     2.40                       .94    1.54

Overall       Median                 .96                                .92
              Mean                   .96                                .92
              Range              .94-.97                            .85-.96

Note. N = number of subjects.
      I = number of items.
      A dash or question mark marks a value that is illegible in the
      source copy.
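
For readers who wish to reproduce coefficients of this kind, the following is a minimal sketch of Kuder-Richardson Formula 20 and of the standard error of measurement. The response matrix is invented, the use of the population variance of total scores is an assumption (conventions differ), and the reading of the SE column as S.D. times the square root of one minus the reliability is an inference from the reported values, not a statement found in the text.

```python
import math


def kr20(responses):
    """Kuder-Richardson Formula 20 for rows of 0/1 item scores."""
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    # Population variance of total scores (assumption; see lead-in).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


def standard_error(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


# Invented responses: five examinees by four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))

# LCDT Form 37 as reported: S.D. 7.15 (Table 2), K-R 20 .92 (Table 3).
print(round(standard_error(7.15, 0.92), 2))
```

Applied to LCDT Form 37, the second function returns a value that agrees with the SE of 2.02 reported for that form, which is what suggests the interpretation of the SE column given above.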
Table 4

Zero-Order Correlations of MCC Scores with LCDT Scores

                 Test Level

          I         II        III

         .81       .73        --

Note. Level III correlations do not include grade 9 data. The Level III
coefficient is illegible in the source copy.
Construction and Administration of the PRT

The PRT was designed, as noted previously, as a literal comprehension
criterion measure for use in a construct validity study of the MCC. In the
spring of 1977, the PRT, the MCC, and three other measures were administered
to students in grades 3, 6, and 9 in one metropolitan New York district and
two upstate districts, one urban and one suburban. The schools and classes
in the schools were selected for their socioeconomic and academic
representativeness. The three other measures were the Gates-MacGinitie
Reading Tests--Comprehension (Gates), the Stanford Achievement Test--Reading
Comprehension (SAT), and the Degrees of Reading Power Test (DRP), currently
under development in the New York State Education Department.
Intercorrelational results of the MCC with all four other measures may be
obtained on request from the Bureau of School and Cultural Research. For
purposes of this paper, only results involving the PRT, MCC, and Gates will
be reported.

Approximately 1,350 students received either the PRT and the MCC, the
PRT and the Gates, or the MCC and the Gates. The tests were administered
one week apart. The actual test combinations are listed below:



3 

PRT/MCC 

PRT/Gates Primary G 
MGC/Gates Primary C 



Grade 
6 

PRT/MCC 

PRT/Gates Survey D 
MCC/Gates Survey D 



9 

PRT/MCC 

PRT/Gates Survey E 
MCC/Gates Survey E 



The Gates was used in this construct validation study because of its
reputation as principally a measure of literal comprehension. High correlations
were expected among the three measures; if such correlations were obtained,
they would be interpreted as constituting strong evidence for the validation
of the MCC as a measure of literal comprehension. Similarly, high correlations
would also validate the PRT as a measure of literal comprehension; the PRT,
of course, has greater face validity than the MCC as a measure of literal
comprehension.

There were three test levels for the PRT and the MCC, and three parallel
test forms were constructed at each level. That is, the passages on the forms
at each test level shared the same range of difficulty, and increases in
difficulty from passage to passage were identical. Each PRT test form had five
passages, and there were six items for every passage. The six items, selected
from the pool of items written for each passage, were chosen on the basis of
two criteria: (1) avoidance of mutual cueing and (2) quality (e.g., avoidance
of awkwardness and ungrammaticalness). A PRT passage, with its items, is
illustrated in Figure 2.

Means and standard deviations, Kuder-Richardson Formula 20 reliability
coefficients, and Pearson Product-Moment correlation coefficients were
computed for the three tests and are reported, respectively, in Tables 5,
6, and 7.

The means and standard deviations for the PRT suggest a good deal of
consistency across the test forms, which in turn implies a degree of success
in applying the item-writing rules and in attaining parallelism among test
forms. (For future reference the relatively low standard deviations for the
PRT forms at grade 9 should be noted here.) The very high K-R 20's for both
the PRT and the MCC are evidence of the internal consistency of both measures
and of the consistency of student responses to the PRT and MCC formats.
The correlation coefficients are also high, as expected, especially at grades
6 and 3. At grade 6 the correlations indicate that approximately 70 percent
of the variance in each of the three test combinations is shared and can be
attributed to the same trait, i.e., literal comprehension. These correlations
give strong support to the contention that all three tests are measures of
literal comprehension.
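
The figure of approximately 70 percent follows from squaring the grade 6 correlations, since the proportion of variance two measures share is the square of their correlation. The short sketch below works this out for the coefficients reported in Table 7; the variable and function names are illustrative only.

```python
# Zero-order correlations as reported in Table 7.
table_7 = {
    3: {"PRT-MCC": 0.80, "PRT-Gates": 0.79, "MCC-Gates": 0.76},
    6: {"PRT-MCC": 0.84, "PRT-Gates": 0.83, "MCC-Gates": 0.84},
    9: {"PRT-MCC": 0.68, "PRT-Gates": 0.48, "MCC-Gates": 0.76},
}


def shared_variance(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r * r


for grade, pairs in table_7.items():
    row = ", ".join(f"{name}: {shared_variance(r):.0%}"
                    for name, r in pairs.items())
    print(f"Grade {grade}: {row}")
```

The same calculation shows how much weaker the grade 9 PRT-Gates relationship is: a correlation of .48 corresponds to less than a quarter of the variance.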
     The two gods took on the appearance of poor wayfarers and wandered
through the land, knocking at each lowly hut or great house they came
to and asking for food and a place to rest in. Not one would admit
them; every time they were dismissed insolently and the door barred
against them. They made trial of hundreds; all treated them in the
same way.

63. What did the two divine beings do?

    1. pretended to be lost and sick
    2. assumed the likeness of needy travelers
    3. acted like common gentlemen
    4. appeared as great and worthy citizens

64. Where did the two divine beings roam?

    1. throughout the country
    2. throughout their palaces
    3. everywhere but the market place
    4. only in the forest

65. What did the two divine beings request?

    1. somewhere to wash and rooms to sleep in during the night
    2. water to drink and a place to clean up in
    3. a place to pray and some water to drink
    4. something to eat and a spot to pause and relax in

66. What would nobody do?

    1. turn the two gods away
    2. let the two gods in
    3. admit that they had room and food to spare
    4. permit the two gods to leave

67. What did the two divine beings do at every humble shanty or
    fine mansion they arrived at?

    1. rang the doorbell
    2. rapped on the door
    3. stared in the window
    4. stood by the gate

68. How did everyone behave toward the gods?

    1. courteously
    2. similarly
    3. pleasantly
    4. differently

Figure 2. Sample Paraphrase Reading Test Passage with Accompanying Items.
Table 5

Means and Standard Deviations for PRT, MCC, and Gates
by Grade Level and Test Combination Group

PRT
                        MCC             Gates
Grade      Form         Group           Group           Combined

  3        321          19.5(6.0)       20.0(6.3)       19.8(6.2)
           322          18.3(6.8)       19.5(7.3)       18.9(7.1)
           323          17.6(6.4)       18.0(5.7)       17.3(6.1)

  6        621          22.2(6.8)       23.1(4.7)       22.6(5.8)
           622          19.4(5.5)       18.9(5.7)       19.2(5.5)
           623          18.8(7.0)       19.6(6.2)       19.2(6.6)

  9        921          18.9(4.3)       20.6(4.2)       19.8(4.3)
           922          19.4(5.0)       19.6(4.9)       19.5(5.0)
           923          17.4(4.7)       19.5(5.2)       18.5(5.0)

MCC
                        PRT             Gates
Grade      Form         Group           Group           Combined

  3        311          36.8(10.0)      34.2(11.6)      35.5(10.3)
           312          33.2(11.2)      30.4(12.4)      31.8(11.8)
           313          33.7(11.4)      31.4(11.5)      32.6(11.7)

  6        611          40.3(10.3)      39.2(9.3)       39.8(9.8)
           612          36.1(9.9)       33.0(10.0)      34.6(9.9)
           613          35.0(10.6)      35.1(8.8)       35.1(9.7)

  9        911          40.2(8.0)       40.3(8.5)          --
           912          41.4(9.8)       41.4(8.5)       41.4(9.2)
           913          36.0(9.1)       36.0(8.0)       36.0(8.6)

Gates
                        PRT             MCC
Grade                   Group           Group           Combined

  3                     33.3(9.5)       29.7(12.3)      31.5(10.9)
  6                     40.7(8.6)       39.6(9.3)       40.2(9.0)
  9                     39.5(10.2)      38.0(10.5)      38.8(10.4)

Note. Entries are means, with standard deviations in parentheses. A dash
marks a value that is illegible in the source copy. Most of the group
sizes (N) printed beneath the entries are also illegible; the legible
values are N = 75 (Gates group) and N = 155 (combined) for Form 313,
N = 151 (combined) for Form 613, and N = 227 (PRT group) and N = 437
(combined) for the grade 9 Gates.
Table 6

Kuder-Richardson Formula 20 Reliability Coefficients for the
Multiple-Choice Cloze Test and Paraphrase Reading Test by Grade Level

             Multiple-Choice Cloze            Paraphrase Reading Test

             Form      N      K-R-20          Form      N      K-R-20

Grade 3      311      321      .97            321      153      .96
             312      329      .95            322      157      .96
             313      314      .96            323      152      .90
             Average K-R-20 = .96             Average K-R-20 = .94

Grade 6      611      305      .97            621      152      .97
             612      320      .96            622      155      .90
             613      299      .94            623      146      .95
             Average K-R-20 = .95             Average K-R-20 = .94

Grade 9      911      336      .98            921      173      .94
             912      332      .97            922      171      .91
             913      336      .95            923      182      .92
             Average K-R-20 = .97             Average K-R-20 = .92
Table 7

Zero-Order Correlations

Grade        PRT-MCC        PRT-Gates        MCC-Gates

  3            .80             .79              .76
  6            .84             .83              .84
  9            .68             .48              .76



As with the LCDT, part of the analysis of the PRT results involved
examination of deviant test items. Deviant items were tentatively identified
by means of z-scores, and the items so identified were inspected for the
sake of determining the causes of actual deviance. Inspection of the PRT
items is as yet incomplete, but preliminary findings indicate that PRT items
are deviant in slightly higher proportions than are LCDT items. Discoverable
causes of PRT item deviance seem closely related to problems involved in
making paraphrases. Further study will attempt to determine the
relationship between statistical deviance and explainable (actual) deviance.

Discussion

LCDT 

The LCDT has high face validity as a measure of literal comprehension.
Its items require no propositional inference, no drawing of conclusions, no
analysis or synthesis of ideas. The test data confirm the consistency of
the measure and of the application of the rules for writing wh- detail items.
These two points, in concert with moderately high correlation coefficients
and factor analytic findings, provide fairly strong grounds for the
validation of the LCDT as a measure of literal comprehension. Certainly the
LCDT battery of passages and items represents a resource of high potential.

The LCDT does have one possible shortcoming. Because its items are
verbatim items, test-wise students may answer them correctly without reading
or comprehending the passage by a process of orthographic or phonetic
matching. (The slightly transformed stems and correct responses may be located
in the passage.) It is for future study to determine the extent to which
such test-wiseness techniques are employed under actual test-taking conditions.
It would seem unlikely that test-wiseness would come into play frequently
enough to invalidate results for a single test administration. The
practical question is whether test-wiseness could increasingly become a
factor across several test administrations, say, in an achievement monitoring
design.
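
To make the concern concrete, the sketch below simulates orthographic matching on one sample LCDT item: it looks for response options that occur word for word in the passage, with no attempt at comprehension. It is an illustration only; the function names are hypothetical, and nothing here is meant as a model of how students actually behave.

```python
import re


def normalize(text):
    """Lower-case the text and reduce it to words separated by spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def verbatim_options(passage, options):
    """Return the labels of options that appear verbatim in the passage."""
    padded_passage = f" {normalize(passage)} "
    return [label for label, text in options.items()
            if f" {normalize(text)} " in padded_passage]


passage = (
    "London is known especially for its dense fog. Since the city is near "
    "the ocean, the moist air seeps over the city and its airports, cools, "
    "and changes to fog."
)
# Sample item: "Where is the city of London?"
options = {
    "A": "near the ocean",
    "B": "on seven hills",
    "C": "in Europe",
    "D": "on a wide river",
}
matches = verbatim_options(passage, options)
print(matches)
```

Only the keyed response is found in the passage, so a purely mechanical matcher would answer this item correctly. The same matcher applied to a fully paraphrased item would find no option in the passage, which is the rationale for the PRT.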
PRT 

Face validity for the PRT is higher than for the LCDT. There is no
problem of orthographic-phonetic matching with PRT items. The consistency
of the PRT and of the application of its item-writing rules is attested to
in the data. These factors, combined with high validity coefficients,
provide very strong support for the PRT (and also for the MCC and the Gates)
as a measure of literal comprehension.

The relatively low correlation between the PRT and the Gates for grade
9 warrants comment. Part of the explanation is statistical. The distribution
of Gates scores for grade 9 is positively skewed, and the variability of
scores on the PRT is somewhat less for grade 9 than for grades 3 and 6. These
two factors partially explain the relatively low correlation. Much of the
low correlation is explicable in terms of shortcomings of the PRT forms for
grade 9, however.

Two problems occurred in writing the paraphrase items for the grade 9
forms; the problems did not occur exclusively at the grade 9 level, but they
were more pervasive at that level. One problem involved the writing of
paraphrases and the other involved the increasing length of item stems and
responses.
An effort was made to control paraphrase vocabulary so that it did not
exceed the grade level of the passage source. This could be done fairly
consistently on passages at grade 6 and below; available graded word lists
facilitated vocabulary control for the paraphrasing of these passages. But
for passages taken from sources above grade 6, available graded word lists
were inadequate as sources of synonyms. Words which would serve as acceptable
synonyms did not appear on the graded word lists. Thus, paraphrase vocabulary
increased in difficulty on passages above grade 6, and the proportion of such
passages was much higher on the grade 9 forms.

The second problem, increasing length of item stems and responses, was
a function of the more difficult passages which appeared on the grade 9
forms. By definition, more difficult passages feature higher proportions of
long sentences. Paraphrases of long sentences will themselves be long.
And greater stem and response length contributes to greater item difficulty.

In other words, application of the paraphrase technology in producing
items for the grade 9 forms elevated the difficulty of the items on those
forms. One further piece of evidence illustrating the problem with the
grade 9 forms lies in the relationship between PRT and MCC test forms. For
grade 9 the PRT forms were relatively much more difficult in comparison to
the MCC forms than they were at grades 3 and 6. This additional evidence
further confirms the increased difficulty of the grade 9 PRT forms. The
relatively low grade 9 correlations between the PRT and the Gates (and
even the somewhat lower correlations between the PRT and the MCC at grade
9), then, can be largely understood as the result of problems in the
application of the paraphrasing and item-writing rules.
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Conclusions 

The findings of this investigation into the feasibility of producing
rule-based measures of literal comprehension are very positive. Application
of the rules developed for writing both verbatim wh- detail items and
paraphrase items was successful. The rule-based items permitted construction of
test forms with high degrees of consistency and reliability and strong
evidence of validity.

Neither the LCDT nor the PRT was without problems, however. The
verbatim wh- detail items of the LCDT are open to the charge that they can
be answered by the application of test-wiseness skills. No obvious solution
to this problem comes immediately to mind. As stated above, further research
might profitably investigate the extent to which such test-taking skills
contaminate test results. Also, future investigation could be applied to
the solution of the test-wiseness problem.

The problem with the PRT, that the items became disproportionately
difficult on the upper-grade-level passages, is not insoluble. In fact,
the problem is at least as much attributable to the constraints upon
item-writing imposed by the brief passages used on the PRT forms as it is to the
paraphrase item technology. The obvious solution to the problem is to write
paraphrase items on longer passages; for example, the passages on the LCDT.
Longer passages would permit much greater flexibility in the writing of
paraphrase items because they would contain more sentences for which acceptable
paraphrases could be written. With the short passages used on the PRT,
paraphrases had to be forced for the sake of accumulating six items per passage.
With longer passages and more flexibility in test construction, poor quality
paraphrases would no longer have to be written.

Whether the use of longer passages would permit the extension of the
paraphrase technology to passages from sources below grade four is
conjectural. It would seem that the paraphrase writing rules could be applied to
longer passages even at such low grade levels. Lengthened passages might
also alleviate the problem of controlling for vocabulary difficulty at upper
grade levels.

Several practical recommendations arise from this analysis of the LCDT
and the PRT. The first recommendation is that the LCDT be used; it is an
extant resource which could serve in achievement monitoring designs, for
example, or it could be used instructionally if teachers had it to use. A
corollary of this recommendation is that the range of the passages, presently
20 difficulty levels, be extended at least to 26 difficulty levels to increase
the test's utility for upper-grade students, many of whom would quickly top
out on the extant passages.

Another recommendation follows from the length of the LCDT passages,
which makes them very suitable for the application of paraphrase item technology.
Paraphrase items should be written for LCDT passages, then. If this suggestion
were followed, all possible paraphrase items should be written on each
passage. Such items, with their superior face validity, would constitute
an extremely valuable resource for the measurement of literal comprehension.

The original design of the LCDT called for a maximum of eight wh- detail
items per passage. It is here recommended that the number of items be
increased by the writing of all possible verbatim wh- detail items on each
LCDT passage. (The task is a finite one if the items are verbatim.) The
larger pool of items resulting from this exercise would greatly increase the
flexibility and utility of the resource.

It is clearly more difficult to write paraphrase items than it is to
write verbatim wh- detail items. The rewards are greater, though, and this
should be kept in mind if such options are ever seriously considered.

One final remark. In the application of the paraphrase item-writing
rules, care should be taken to avoid forcing paraphrases where no adequate
ones present themselves. Forcing could result in either unconscionably
awkward or barbarous-sounding paraphrases or paraphrases which grow
increasingly metaphorical. Either excess has an invalidating effect on the
paraphrase item as a measure of literal comprehension. Judgment and sensitivity,
then, must be exercised in the application of item-writing rules (and in
the review and selection of items for test form construction).

There is much to be said for rule-based approaches to the measurement
of reading comprehension, but one must be wary of the temptation to assume
that reading comprehension measures can be completely automated or mechanized.
Labor under such delusion must surely conclude in frustration.

Members of the research community interested in pursuing these suggestions
may have access to the materials already prepared and make use of the rules
accompanying this paper.

APPENDIX A

RULES FOR CONSTRUCTING WH- DETAIL ITEMS



I. WH- Detail Items

   Format: Levels [1-4], 3 responses
           Levels 5-20, 4 responses

1. Given a passage:

2. Randomly take a sentence number from a permutation block
   representing all possible sentences in the passage (in this
   case, 1-16).

   2.1. Take numbers from left to right across the block and
        so on down through the entire block if necessary; if the
        block is exhausted before the passage, use the next block;
        always start a passage with a new block.

   2.2. If the number taken from the block does not represent a sentence
        in the passage (e.g., 15 when there are only 10 sentences),
        take the next number.

3. Starting at the top, take a detail question type from the
   following alphabetical list (see attachment for illustrative
   examples of detail question types):

        HOW
        WHAT--noun, pronoun
        WHAT--verb
        WHEN
        WHERE
        WHICH
        WHO(M)
        WHY

4. If possible, write the detail question about the sentence
   taken in I.2.

   4.1. Write clear, concise questions in colloquial English,
        changing the wording of the sentence as little as possible.
        (Exception: replace pronouns with their referents.)

        4.1.a. Begin each question with the appropriate detail word
               (e.g., how, what, etc.).

   4.2. Avoid anaphora when possible.¹

   4.3. Avoid inference.²

   4.4. Ask each detail question only once per passage.

   4.5. If possible, ask all 8 detail questions of each passage.

   4.6. Ask only one detail question per sentence unless the
        sentence or passage is rich in detail and there are
        few sentences, in which case repeat I.2. from a new
        permutation block until all 8 wh-questions have been
        asked if possible.

5. If the detail question cannot be asked of the sentence taken
   in I.2. (e.g., there is no answer to a "how" question),
   go on to the next detail question until a detail question is
   asked of the sentence if possible.

   5.1. If a detail question cannot be asked of a given sentence,
        return to that same detail question first on the next
        sentence taken (e.g., if "how" is skipped, return to
        "how" first on the next sentence).

6. Take the next sentence number in the permutation block and ask
   the next detail question until all the detail questions are
   exhausted if possible. (Some passages may not be rich enough
   in detail to provide bases for all eight detail question types.)

7. If possible, take the distractors from the passage verbatim.

   7.1. Write only grammatically and semantically plausible
        distractors.

   7.2. Write parallel distractors when possible.

   7.3. Write distractors that closely match the correct response
        in number of words.

   7.4. If distractors are not parallel or equal in length, write
        at least one distractor that parallels or matches in length
        the correct response.

   7.5. Write no distractors that could be correct in the context
        of the passage.

   7.6. Write distractors that are appropriate to the level of the
        passage.

8. If distractors cannot be taken verbatim from the passage:

   8.1. Take distractors from the passage, changing them as little
        as possible in order to make them parallel and grammatically
        and semantically plausible (e.g., add determiners, adverbs,
        subordinators, etc.; or change verb tense, number, etc.;
        delete words; join words from scattered places in the
        passage).

   8.2. If parallel, plausible distractors cannot be found in the
        passage, or if such distractors make the correct response
        debatable, take distractors from outside the passage. Such
        distractors must meet all the criteria in I.7.1. to
        I.7.6. above.
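
The sampling and rotation procedure in rules 2 through 6 can be sketched in code. The following is a minimal illustration, not the authors' procedure as implemented: the function names, the `can_ask` callback (which stands in for the item writer's judgment about whether a sentence supports a given question type), and the `max_blocks` limit are all assumptions introduced here. A shuffled list of the numbers 1-16 is assumed to be an acceptable stand-in for a printed permutation block.

```python
import random

# Alphabetical list of detail question types from rule 3.
QUESTION_TYPES = ["HOW", "WHAT-noun", "WHAT-verb", "WHEN",
                  "WHERE", "WHICH", "WHO(M)", "WHY"]


def permutation_block(block_size, rng):
    """Return one random permutation of 1..block_size (assumed block format)."""
    block = list(range(1, block_size + 1))
    rng.shuffle(block)
    return block


def assign_questions(n_sentences, can_ask, block_size=16, max_blocks=1, seed=None):
    """Pair sentence numbers with question types, following rules 2-6.

    can_ask(sentence_number, question_type) is a hypothetical callback that
    represents the writer's judgment; it is not part of the original rules.
    """
    rng = random.Random(seed)
    remaining = list(QUESTION_TYPES)   # types not yet asked (rule 4.4)
    assignments = []
    for _ in range(max_blocks):        # rule 4.6: a new block for rich passages
        for number in permutation_block(block_size, rng):
            if number > n_sentences:   # rule 2.2: skip numbers with no sentence
                continue
            # Rules 5 and 5.1: try the earliest unasked type first, so a
            # skipped type is retried first on the next sentence.
            for qtype in remaining:
                if can_ask(number, qtype):
                    assignments.append((number, qtype))
                    remaining.remove(qtype)
                    break
            if not remaining:          # rule 6: all eight types exhausted
                return assignments
    return assignments
```

With `max_blocks=1` each sentence receives at most one question; raising it corresponds to the exception in rule 4.6 for passages that are rich in detail but have few sentences.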






Footnotes

¹The referent for a pronoun may be in preceding sentences. Adverbs
like "soon" or "then" may refer to actions or situations in preceding
sentences.

²The only exceptions would be passages where the logical relationship
between two or more sentences is clearly implied. For example, "Carmen is
writing to her friend, Carlos. Next Saturday will be his birthday." Why
is Carmen writing to Carlos? Because next Saturday will be his birthday.
"Because" is not in the passage but is logically and clearly implied as an
expression of the relationship between the two sentences. "Tim, the turtle,
has a new shell. He is very happy." Why is Tim happy? Because he has a
new shell.

Illustrative WH- Detail Items

Detail   Type                       Question                          Answer(s)
word
------   ----                       --------                          ---------
How      Adverbial                  Q. [illegible]                    A. [illegible], etc.
                                    Q. [illegible] tree?              A. very tall
                                    Q. How are shoes made?            A. with leather
                                    Q. How did the brook [illegible]  A. rapidly
         Verb                       Q. [illegible] to school?         A. [illegible]
         Adjectival                 Q. How did Mary look?             A. sad, happy, pretty, etc.

What                                Q. [illegible]                    A. help
                                    Q. What did John eat?             A. lunch, ice cream, it
                                    Q. What swam fast?                A. the fish
         Verb                       Q. What did Tim do?               A. ran, ate, slept, fell, etc.
                                    Q. What does Jane do?             A. sings, laughs, etc.
                                    Q. [illegible] doing?             A. thinking, talking, etc.

When     Adverbial-result           Q. [illegible] corn pop?          A. when the steam inside expanded
         Adverbial-time             Q. When did the boys come home?   A. in the evening, after school,
                                                                         at 4 o'clock, etc.

Where                               Q. Where did Jack go?             A. for a walk, outside, to town,
                                                                         to New York

Which    Adjectival                 Q. Whose cat was it?              A. Tom's, Mary's, John's
                                    Q. Which hat did Davy [illegible] A. coonskin, blue, floppy, big
                                    Q. [illegible]                    A. [illegible]
                                    Q. What color was Bill's shirt?   A. blue, red, white

Who(m)                              Q. Who played ball?               A. Herbie, the boys, the players,
                                                                         he, they, etc.
         [illegible] for person)    Q. [illegible] hit?               A. [illegible], Mary, etc.

Why      Adverbial-cause, explicit  Q. Why did Tom trip?              A. because his shoes were too big
         Implicit                   Q. Why did the ice melt?          A. The sun got very hot.






APPENDIX B 



RULES FOR CONSTRUCTING PARAPHRASE
ITEMS FOR PAM ACHIEVEMENT MONITORS

I. Passage Selection

   A. Determine range of difficulty for test forms.

      1. Identify each difficulty level in the Reading/Literature
         MCC Exercises from which passages will be drawn.

      2. Draw randomly the requisite number of exercises at
         each difficulty level.

      3. Replace deleted words in blanks in each MCC exercise
         drawn.

II. Paraphrasing Selected Exercise Passage

   A. Number each sentence in every exercise passage.

      1. In passages with compound sentences, number each main clause.

      2. In passages with complex sentences, number each main clause,
         subordinate clause, and long modifying phrase.

   B. Paraphrase² each numbered sentence or clause.

      1. If possible, replace all substantive words (nouns, verbs,
         modifiers³) with synonyms⁴ (i.e., equivalent words or
         phrases).

         a. Consult when necessary a dictionary, thesaurus, or
            dictionary of synonyms.

         b. Consult other relevant reference works as necessary.

      2. Proper nouns and pronouns often cannot be paraphrased.

      3. Auxiliary verbs and the verb to be cannot always be paraphrased.

      4. If possible, paraphrase vocabulary should not exceed the
         vocabulary level of the passage (as determined by difficulty
         level).

         a. Consult Harris and Jacobson, 1972, when necessary.

         b. Consult Carroll, Davies, and Richman, 1971, when necessary.

      5. Retain meaning of original sentence (i.e., vocabulary and
         syntax of paraphrase should not involve significant alteration
         of the literal meaning of the original sentence).



¹Rules for paraphrasing are based on Anderson's (1972) definition of
paraphrase.

   C. Flexibility in the writing of paraphrases is illustrated below:

      1. A paraphrase does not have to have the exact number of words
         as the original sentence; it may be slightly longer or shorter.

      2. Syntax may be altered in various ways.

         a. Order of clauses or phrases may be changed as long as
            literal meaning is retained.

         b. Voice of verbs may be changed (e.g., active to passive).

         c. Phrases may replace single words (and vice versa).
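
Rule II.B.4 (paraphrase vocabulary should not exceed the vocabulary level of the passage) lends itself to a simple mechanical check. The sketch below is illustrative only: `graded_words` is a hypothetical dictionary mapping each word to a grade level, standing in for a graded word list of the kind cited above, and treating unlisted words as suspect is an assumption made here, not a rule from the paper.

```python
import re


def words_above_level(paraphrase, graded_words, passage_level):
    """Return paraphrase words that exceed the passage level or are unlisted.

    graded_words is a hypothetical {word: grade_level} mapping; it is not
    the format of any actual published word list.
    """
    flagged = []
    for word in re.findall(r"[a-z']+", paraphrase.lower()):
        level = graded_words.get(word)
        # Unlisted words are flagged so that a person can review them.
        if level is None or level > passage_level:
            flagged.append(word)
    return flagged
```

A flagged word is a prompt for review, not an automatic rejection, since the rule itself is qualified by "if possible."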

III. Writing Items for Paraphrased Passages⁵

   A. Write WH-detail items on each paraphrased sentence, clause, or
      phrase. Adhere as much as possible to the following rules:

      1. Write clear, concise questions in colloquial English, changing
         the wording of the paraphrase as little as possible.
         (Exception: replace pronouns with their referents.)

      2. Begin each question with the appropriate detail word
         (e.g., how, what, when, where, etc.).

      3. Avoid writing inferential WH-detail items (e.g., do not write
         a "why" item unless the causal relationship is either explicit
         or clearly implied in the text).

      4. Write as many WH-detail items as possible for each paraphrase.

      5. Try to write at least two WH-detail items for each paraphrase.
         Note: Requirement for test forms was six WH-detail items/
         passage. Passages are very short (50-80 words).⁶

   B. Write three distractors for each item (i.e., four responses,
      including distractors and correct response).

      1. Write only grammatically and semantically plausible
         distractors.

      2. Write parallel distractors when possible.

      3. Write distractors that closely match the correct
         response in number of words.

      4. Avoid writing response arrays in which the correct
         response characteristically stands out because of its
         brevity, length, or syntax.

      5. Write no distractors that could be correct in the context
         of the passage.

      6. Write distractors that are appropriate to the difficulty
         level of the passage (see II.B. above).
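
The length-related distractor rules (III.B.3 and III.B.4) can be partly screened by counting words. The following sketch is an assumption-laden illustration: the one-word `tolerance` and the wording of the messages are choices made here, and plausibility, parallelism, and syntax still require human judgment.

```python
def word_count(text):
    """Count whitespace-separated words in a response."""
    return len(text.split())


def length_problems(correct, distractors, tolerance=1):
    """List length-related concerns for one item's response array.

    tolerance (in words) is an illustrative threshold, not a value
    taken from the rules.
    """
    problems = []
    if len(distractors) != 3:
        problems.append("item should have exactly three distractors")
    target = word_count(correct)
    counts = [word_count(d) for d in distractors]
    for distractor, n in zip(distractors, counts):
        if abs(n - target) > tolerance:
            problems.append(
                f"distractor {distractor!r} differs from the correct "
                f"response by {abs(n - target)} words"
            )
    # The correct response stands out if it is strictly the longest
    # or strictly the shortest option.
    if counts and (target > max(counts) or target < min(counts)):
        problems.append("correct response is the longest or shortest option")
    return problems
```

An empty list means only that no length concern was detected; it says nothing about the other distractor criteria.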



IV. Problems and Responses

   A. Paraphrases

      1. Not every sentence yields an adequate paraphrase. For
         example, vocabulary levels, uniqueness of vocabulary or
         structure, and other factors may make paraphrasing
         difficult.

      2. When sentences which cannot be acceptably paraphrased
         result in passages which do not yield the requisite
         number of items, select another passage randomly from
         the relevant difficulty level.⁷

   B. Items

      1. When item stems contain substantive words verbatim
         from the passage, make sure correct response is not
         verbatim (i.e., do not write verbatim WH-detail
         items).

      2. When a correct response is verbatim, make sure that
         some distractors are also verbatim to diminish the
         possibility of orthographic matching.

      3. When a correct response is partially verbatim
         (e.g., this occurs occasionally in longer responses),
         make sure at least one distractor contains the verbatim
         element which appears in the correct response (to diminish
         orthographic matching).
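
The orthographic-matching concern in rule B.2 can be screened mechanically. The sketch below assumes that "verbatim" means the response's words occur as a contiguous run in the passage, ignoring case and punctuation; that definition and the function names are assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def is_verbatim(response, passage):
    """True if the response's words occur as a contiguous run in the passage."""
    r, p = tokens(response), tokens(passage)
    n = len(r)
    return n > 0 and any(p[i:i + n] == r for i in range(len(p) - n + 1))


def orthographic_matching_risk(correct, distractors, passage):
    """Flag an item whose correct response is verbatim while no distractor is."""
    if not is_verbatim(correct, passage):
        return False
    return not any(is_verbatim(d, passage) for d in distractors)
```

For instance, with the invented passage "The fuel drums burst into flame at noon." a correct response "at noon" paired only with non-verbatim distractors would be flagged, since a reader could pick it by matching letters rather than by comprehension.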

Footnotes

²Extracted from context, subordinate clauses and some phrases may be
paraphrased as main clauses or sentences. Example: "But even [a liar's
invention], being an empty thing that offers no hold . . ." is paraphrased
as "a prevaricator's fiction is a vacuous thing that provides no handle"
for a wh-item as follows: "What kind of thing is a prevaricator's fiction?"

Note: An alternate version of a sentence, clause, or phrase which
"means" what another sentence, clause, or phrase "means" is not necessarily
a paraphrase according to the rules here presented. Saying a thing in
another way is not always equivalent to paraphrasing by these rules.

Such a situation occurs on occasion when a reviewer is dissatisfied
with an item stem (or stem plus response) and rewrites the item to make
it sound better or to avoid heaviness, awkwardness, wordiness, etc., but
without first writing a new paraphrase or without taking the original
paraphrase into consideration. The rewritten item, considered out of
context, will often sound or look better, but it will often no longer be
an item based on an acceptable paraphrase.

A similar problem arises when an item is rewritten but is no longer
a WH-detail item.

³Modifiers include adjectives and adverbs, not articles or determiners.

⁴Superordinate terms are not necessarily acceptable synonyms
(e.g., dog is not necessarily an acceptable synonym for Siberian wolf-hound).

⁵See Rules for Constructing WH-Detail Items, on file with BSGR.

⁶Average number of WH-detail items written for each passage was more
than ten, of which six were selected. Criteria for selection were quality
(e.g., absence of awkwardness and turgidity) and freedom from mutual cueing,
defined as a stem giving away a response to another stem. In the following,
for example, stem A cues the answer to stem B: "A. When did the fuel drums
burst into flame?" "B. What burst into flame?"

⁷Fewer than ten per cent of the passages from the original sample had
to be replaced.
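
The mutual cueing criterion described in the footnotes (a stem giving away the response to another stem) can be approximated by checking whether one item's correct response appears word for word in another item's stem. This is a minimal sketch under that assumption; the function names are hypothetical, the answer "at noon" in the usage example is invented, and paraphrased or partial cues would not be caught.

```python
import re
from itertools import permutations


def tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_run(haystack, needle):
    """True if the token list needle occurs contiguously in haystack."""
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


def mutual_cues(items):
    """Return (cueing_index, cued_index) pairs for a list of (stem, answer) items."""
    cues = []
    for (i, (stem, _)), (j, (_, answer)) in permutations(enumerate(items), 2):
        if contains_run(tokens(stem), tokens(answer)):
            cues.append((i, j))
    return cues


# Usage with the fuel-drum example; "at noon" is an invented answer.
example = [
    ("When did the fuel drums burst into flame?", "at noon"),
    ("What burst into flame?", "the fuel drums"),
]
print(mutual_cues(example))
```

Pairs reported by such a check would still need review when selecting six items per passage, since quality was the other stated selection criterion.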




